A Pattern Search Method for Model Selection of Support Vector Regression

نویسندگان

  • Michinari Momma
  • Kristin P. Bennett
چکیده

We develop a fully-automated pattern search methodology for model selection of support vector machines (SVMs) for regression and classification. Pattern search (PS) is a derivative-free optimization method suitable for low-dimensional optimization problems for which it is difficult or impossible to calculate derivatives. This methodology was motivated by an application in drug design in which regression models are constructed based on a few high-dimensional exemplars. Automatic model selection in such underdetermined problems is essential to avoid overfitting and overestimates of generalization capability caused by selecting parameters based on testing results. We focus on SVM model selection for regression based on leave-one-out (LOO) and cross-validated estimates of mean squared error, but the search strategy is applicable to any model criterion. Because the resulting error surface produces an extremely noisy map of the model quality with many local minima, the resulting generalization capacity of any single local optimal model illustrates high variance. Thus several locally optimal SVMmodels are generated and then bagged or averaged to produce the final SVM. This strategy of pattern search combined with model averaging has proven to be very effective on benchmark tests and in high-variance drug design domains with high potential of overfitting. ∗This work is supported by the NSF Grants IIS-9979860 and IRI-97092306. †Dept. of Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY 12180, [email protected] ‡Dept. of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, [email protected]

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search

In this paper, a method for predicting time series is presented. Time series prediction is a process which predicted future system values based on information obtained from past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...

متن کامل

Development of a Pharmacogenomics Model based on Support Vector Regression with Optimal Features Selection Approach to Determine the Initial Therapeutic Dose of Warfarin Anticoagulant Drug

Introduction: Using artificial intelligence tools in pharmacogenomics is one of the latest bioinformatics research fields. One of the most important drugs that determining its initial therapeutic dose is difficult is the anticoagulant warfarin. Warfarin is an oral anticoagulant that, due to its narrow therapeutic window and complex interrelationships of individual factors, the selection of its ...

متن کامل

Development of a Pharmacogenomics Model based on Support Vector Regression with Optimal Features Selection Approach to Determine the Initial Therapeutic Dose of Warfarin Anticoagulant Drug

Introduction: Using artificial intelligence tools in pharmacogenomics is one of the latest bioinformatics research fields. One of the most important drugs that determining its initial therapeutic dose is difficult is the anticoagulant warfarin. Warfarin is an oral anticoagulant that, due to its narrow therapeutic window and complex interrelationships of individual factors, the selection of its ...

متن کامل

Applying Combined Approach of Sequential Floating Forward Selection and Support Vector Machine to Predict Financial Distress of Listed Companies in Tehran Stock Exchange Market

Objective: Nowadays, financial distress prediction is one of the most important research issues in the field of risk management that has always been interesting to banks, companies, corporations, managers and investors. The main objective of this study is to develop a high performance predictive model and to compare the results with other commonly used models in financial distress prediction M...

متن کامل

The Porosity Prediction of One of Iran South Oil Field Carbonate Reservoirs Using Support Vector Regression

Porosity is considered as an important petrophysical parameter in characterizing reservoirs, calculating in-situ oil reserves, and production evaluation. Nowadays, using intelligent techniques has become a popular method for porosity estimation. Support vector machine (SVM) a new intelligent method with a great generalization potential of modeling non-linear relationships has been introduced fo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002